Abstract:

Here I will outline successes and challenges for finding spectral line sources in large data cubes that are
dominated by noise. This is a 3D challenge as the sources we wish to catalog are spread over several
spatial pixels and spectral channels. While 2D searches can be applied, e.g., channel by channel,
optimal searches take into account the 3-dimensional nature of the sources. In this overview I will
focus on H I 21-cm spectral line source detection in extragalactic surveys, in particular HIPASS, the
HI Parkes All-Sky Survey, and WALLABY, the ASKAP HI All-Sky Survey. I use the original HIPASS
data to highlight the diversity of spectral signatures of galaxies and gaseous clouds, both in emission
and absorption. Among other results, I report the discovery of a 680 km s⁻¹ wide H I absorption trough
in the megamaser galaxy NGC 5793. Issues such as source confusion and baseline ripples, typically
encountered in single-dish H I surveys, are much reduced in interferometric H I surveys. Several large
H I emission and absorption surveys are planned for the Australian Square Kilometre Array Pathfinder
(ASKAP): here we focus on WALLABY, the 21-cm survey of the sky (δ < +30°; z < 0.26) which will
take about one year of observing time with ASKAP. Novel phased array feeds ("radio cameras") will
provide 30 square degrees instantaneous field-of-view. WALLABY is expected to detect more than
500 000 galaxies, unveil their large-scale structures and cosmological parameters, and detect their extended,
low-surface brightness disks as well as gas streams and filaments between galaxies. It is a precursor for
future H I surveys with SKA Phase I and II, exploring galaxy formation and evolution. The compilation
of highly reliable and complete source catalogs will require sophisticated source-finding algorithms as
well as accurate source parametrisation.
1 Introduction

In recent years many remarkable galaxy surveys at op- 
tical and infrared wavelengths have become available. 
The Sloan Digital Sky Survey (SDSS; York et al. 2000) 
databases currently hold many millions of galaxies and 
nearly one million optical spectra (Eisenstein et al. 
2011). McConnachie et al. (2009) extracted 29 million 
galaxies from SDSS DR6 to study Hickson Compact 
Groups and identified nearly 400 000 galaxy groups. 
Over 1.5 million galaxies are listed in the 2MASS Ex- 
tended Source Catalog (XSC; Jarrett et al. 2000), and 
a WISE Extended Source Catalog has recently been 
released (Jarrett et al. 2012). Multi-colour optical sky 
surveys such as PanSTARRS (Stubbs et al. 2010) and 
SkyMapper (Keller et al. 2007) are under way. 

The NASA Extragalactic Database (NED) currently 
holds over 100 million objects classified as galaxies 
(Marion Schmitz, priv. comm.), and the Lyon Extra- 
galactic Database (LEDA) contains at least 1.5 million 
bona-fide galaxies. In comparison, the total number 
of Hi 21-cm detected galaxies is small: several tens of 
thousands (Meyer et al. 2004, Springob et al. 2005, 
Wong et al. 2006, Haynes et al. 2011). The intrinsic 
faintness of the electron spin-flip transition of neutral 
atomic hydrogen (rest frequency 1.42 GHz) makes it 
difficult to detect H I emission from individual galaxies 
at large distances. To study the H I content of galaxies
and diffuse H I filaments between galaxies, we need
radio synthesis telescopes with large collecting areas, 
low-noise receivers and large fields of view. Such re- 
quirements provide a range of engineering and com- 
puting challenges (Chippendale et al. 2010, Cornwell 
2007, Schinckel et al. 2012). 

While we have come a long way since the discovery 
of the 21-cm spectral line by Ewen & Purcell in 1951, 
detecting a Milky Way-like galaxy at redshift z = 1 in 
Hi emission will require the Square Kilometre Array 
(Obreschkow et al. 2011). Several large radio surveys 
have been proposed for SKA pathfinder and precursor 
telescopes and are currently undergoing intense design 
studies. The latter include evaluating and improving 
source-finding algorithms which are needed to extract 
scientifically useful (i.e., highly reliable and complete) 
source catalogs from the large survey volumes. 

This paper is organised as follows: in Section 2 
I will briefly introduce the SKA Pathfinder Hi All- 
Sky Surveys (WALLABY and WNSHS), followed by 
an overview on source finding issues and algorithms 
in Section 3. In Section 4 I use the HI Parkes All- 
Sky Survey (HIPASS; Barnes et al. 2001, Koribalski 
et al. 2004) to illustrate the diversity of galaxy Hi 
profiles and present new detections (see Figs. 1-3). An 
overview of 3D visualisation tools, to explore source 
catalogs and data cubes, is given in Section 5. 
                        HIPASS                    WALLABY                   WNSHS
                        (1)                       (2)                       (3)

telescope               64-m Parkes dish          ASKAP                     WSRT
                                                  36 x 12-m dishes          12 x 25-m dishes
                                                  (under construction)      (maxi-short)
baselines                                         20 m to 2 (6) km          36 m to ~2.5 km
receiver                21-cm multibeam           phased array feed         phased array feed
                                                  (Chequerboard)            (Vivaldi)
Tsys spec.              20 K                      50 K                      50 K
field-of-view           ~1 deg² (13 beams)        30 deg²                   8 deg²
obs. mode               scanning                  dithering/mosaicing       mosaicing
angular resolution      15.5 arcmin               30 (10) arcsec            30 (15) arcsec
pixel size              4 arcmin                  7.5 (2.5) arcsec          10 (4.5) arcsec
sky coverage            δ < +25°                  δ < +30°                  δ > +27°
                        29 343 deg²               30 940 deg²               11 262 deg²
cubes/fields            538 (8° x 8°)             1300                      1410
bandwidth               64 MHz                    300 MHz
no. of channels         1024                      16384
channel width           13.2 km s⁻¹               3.9 km s⁻¹
velocity resolution     18.0 km s⁻¹               3.9 km s⁻¹
integration time        450 s                     8 h                       4 (12) h
  per pointing
rms per channel         ~13 mJy beam⁻¹            1.6 mJy beam⁻¹            1.5 (1.2) mJy beam⁻¹
frequency coverage      1362.5 - 1426.5 MHz       1130 - 1430 MHz
velocity range (cz)     -1280 to +12 700 km s⁻¹   -2000 to +77 000 km s⁻¹
                        z < 0.04                  z < 0.26
galaxies                ~6500                     ~500 000                  ~100 000
duration                1997 - 2002               from 2014?

Single entries under WALLABY/WNSHS span both columns.

Table 1: Comparison of H I survey parameters for HIPASS, WALLABY and WNSHS. The parameters for
WALLABY and WNSHS are approximate and may change in future. Notes: (1) Barnes et al. (2001) and
references therein; (2) Koribalski & Staveley-Smith (2009), see www.atnf.csiro.au/research/WALLABY;
(3) Jozsa et al. (2010), see www.astron.nl/~jozsa/wnshs. Details are given in Section 2.
Figure 1: Integrated H I distribution overlaid on DSS B-band images (top) and H I spectra (bottom) for two galaxy
systems: HIPASS J0354-36a (left) and HIPASS J1239-11 (right). The data shown here are from the HI
Parkes All-Sky Survey (HIPASS); the gridded beam is 15.5 arcmin. HIPASS J0354-36a encompasses the S0 galaxy
IC2006 and the dwarf galaxy ESO359-G005; the H I contour levels are 1, 2, 3 and 4 Jy beam⁻¹ km s⁻¹. HIPASS
J1239-11 is better known as the Sombrero Galaxy (M 104); the H I contour levels are 2, 3 and 4 Jy beam⁻¹ km s⁻¹.
Optical galaxy positions are indicated with circles. Red markers in the HIPASS spectra indicate fitted H I properties
such as the peak flux and the 50% and 20% velocity width. The fitted baseline is shown in grey (0th order for
HIPASS J0354-36a and 5th order for HIPASS J1239-11).



2 SKA Pathfinder HI Surveys 

ASKAP, the Australian SKA Pathfinder (DeBoer et 
al. 2009), consists of 36 x 12-m antennas and is located
in the Murchison Shire of Western Australia.
As of June 2012 the construction of all ASKAP anten- 
nas is completed. Of the 36 antennas, 30 are located 
within a circle of ~2 km diameter, while six anten- 
nas are at larger distances providing baselines up to 
6 km. Each ASKAP dish will be equipped with novel 
Chequerboard phased array feeds (PAFs). We expect 
the first six Mk I PAFs (the so-called BETA array) to
be ready for engineering testing by the end of 2012, 
with science commissioning to be integrated when fea- 
sible. The instantaneous field-of-view of the ASKAP 
PAFs is 5.5 deg x 5.5 deg, i.e., 30 square degrees (Chippendale
et al. 2010), making ASKAP a 21-cm survey
machine. The WSRT APERTIF upgrade employs Vi- 
valdi PAFs, delivering a field-of-view of eight square 
degrees (Verheijen et al. 2008). The system temper- 
ature specification of the ASKAP PAFs is 50 K over 
the full band, with future PAF generations promising 
much higher performance. 



WALLABY, the Widefield ASKAP L-band Legacy
All-sky Blind surveY (led by Bärbel Koribalski &
Lister Staveley-Smith; see Koribalski et al. 2009), will
cover 75% of the sky (-90° < δ < +30°) over a
frequency range from 1.13 to 1.43 GHz (corresponding to
-2000 < cz < 77 000 km s⁻¹) at resolutions of 30"
and 4 km s⁻¹. For further details see Table 1.
WALLABY will be carried out using the inner 30 antennas
of ASKAP, which provide excellent uv-coverage and
baselines up to 2 km. High-resolution (10") ASKAP
H I observations using the full 36-antenna array will
require further computing upgrades.
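
The quoted velocity limits follow directly from the band edges. As a rough check, the sketch below converts an observed frequency to cz in the optical convention; the rest frequency constant and the function name are assumptions introduced here for illustration.

```python
# Rest frequency of the H I 21-cm line (MHz) and the speed of light (km/s).
HI_REST_FREQ_MHZ = 1420.405751768
C_KMS = 299_792.458


def optical_velocity(freq_mhz: float) -> float:
    """Return cz in km/s (optical convention) for an observed frequency in MHz."""
    z = HI_REST_FREQ_MHZ / freq_mhz - 1.0
    return C_KMS * z


# Band edges of the 1.13 to 1.43 GHz frequency range.
for freq in (1430.0, 1130.0):
    z = HI_REST_FREQ_MHZ / freq - 1.0
    print(f"{freq:7.1f} MHz -> z = {z:+.4f}, cz = {optical_velocity(freq):9.0f} km/s")
```

The two edges come out close to -2000 and +77 000 km s⁻¹, i.e. z < 0.26 at the low-frequency end.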

WNSHS, the Westerbork Northern Sky HI Survey
(led by Gyula Jozsa; see Jozsa et al. 2010), will cover
a large fraction of the northern sky (δ > +27°) with
APERTIF (Verheijen et al. 2008) over the same
frequency range as WALLABY with ASKAP. Both H I
surveys combined will achieve a true all-sky survey
with unprecedented resolution and depth. The science
goals of both surveys are well developed and complement,
as well as enhance, each other.
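
The sky areas of the surveys can be reproduced from their declination limits alone, since the area between two declinations scales with the difference of their sines. The following is a minimal sketch; the function and dictionary names are illustrative only.

```python
import math

# Area of the full sky in square degrees (about 41 253 deg^2).
FULL_SKY_DEG2 = 4.0 * math.pi * (180.0 / math.pi) ** 2


def band_area_deg2(dec_min_deg: float, dec_max_deg: float) -> float:
    """Area (deg^2) of the sky between two declination limits."""
    s_lo = math.sin(math.radians(dec_min_deg))
    s_hi = math.sin(math.radians(dec_max_deg))
    return FULL_SKY_DEG2 * (s_hi - s_lo) / 2.0


# Declination limits as listed for the three surveys.
surveys = {
    "HIPASS": (-90.0, 25.0),
    "WALLABY": (-90.0, 30.0),
    "WNSHS": (27.0, 90.0),
}
for name, (lo, hi) in surveys.items():
    area = band_area_deg2(lo, hi)
    print(f"{name:8s} {area:8.0f} deg^2 ({100.0 * area / FULL_SKY_DEG2:.1f}% of the sky)")
```

For WALLABY this gives 75% of the sky, and the small overlap between δ = +27° and +30° is why the WALLABY and WNSHS areas add up to slightly more than the full sky.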

WALLABY and WNSHS are made possible by the
development of phased array feeds, delivering a much
larger field-of-view than single feed horns or multi-beam
systems. A comparison of H I survey parameters
for HIPASS, WALLABY, and WNSHS is given in
Table 1.

Figure 2: HIPASS spectra of the galaxies IC5063 (top left), NGC 3801 (top right), NGC 612 (bottom left)
and TXS 2226-184 (bottom right). Hanning smoothing was used to improve the signal to noise of the detected
H I emission and absorption features (5-point Hanning = 52 km s⁻¹; 3-point Hanning = 26 km s⁻¹). Fitted H I
properties are indicated by red markers in the HIPASS spectra; the fitted 0th order baseline is shown in grey.

WALLABY will take approximately one year (i.e.,
8 hours per pointing) and deliver an rms noise of 1.6
mJy beam⁻¹ per 4 km s⁻¹ channel. We estimate that
more than 500 000 gas-rich galaxies can be individually
detected within the WALLABY volume (Johnston et
al. 2008, Duffy et al. 2012, Koribalski et al. 2012).
The WALLABY team will examine the H I properties
and large-scale distributions of galaxies out to a
redshift of z = 0.26 (equivalent to a look-back time of
~3 Gyr) in order to study: (1) galaxy formation and
the missing satellite problem in the Local Group, (2)
evolution and star-formation in galaxies, (3) mergers
and interactions in galaxies, (4) the H I mass function
and its variation with galaxy density, (5) physical
processes governing the distribution and evolution of cool
gas at low redshift, (6) cosmological parameters relating
to gas-rich galaxies, and (7) the nature of the cosmic
web. WALLABY will detect dwarf galaxies (M_HI
= 10⁸ M☉) out to a distance of ~60 Mpc, massive
galaxies (M_HI = 6 x 10⁹ M☉) to ~500 Mpc, and
super-massive galaxies like Malin 1 (M_HI = 5 x 10¹⁰ M☉) to
the survey 'edge' of 1 Gpc. The mean sample redshift
is expected to be z = 0.05 (200 Mpc).
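
To get a feel for these mass and distance pairs, the sketch below inverts the standard optically thin relation M_HI = 2.356 x 10⁵ D² S_int (D in Mpc, S_int in Jy km s⁻¹) to obtain the integrated flux each example galaxy would have at its quoted limiting distance. The relation is not given in the text above and is used here as an assumption; it ignores redshift corrections, which start to matter towards the survey edge.

```python
def integrated_flux_jy_kms(m_hi_msun: float, distance_mpc: float) -> float:
    """Integrated H I flux (Jy km/s) of a galaxy of given H I mass and distance.

    Uses the low-redshift, optically thin relation
    M_HI = 2.356e5 * D^2 * S_int, with D in Mpc and S_int in Jy km/s.
    """
    return m_hi_msun / (2.356e5 * distance_mpc ** 2)


# (label, H I mass in solar masses, limiting distance in Mpc) from the text.
examples = [
    ("dwarf galaxy", 1e8, 60.0),
    ("massive galaxy", 6e9, 500.0),
    ("Malin 1-like galaxy", 5e10, 1000.0),
]
for label, mass, dist in examples:
    flux = integrated_flux_jy_kms(mass, dist)
    print(f"{label:20s} {flux:.3f} Jy km/s at {dist:.0f} Mpc")
```

All three cases land at integrated fluxes of order 0.1 to 0.2 Jy km s⁻¹, which is consistent with them marking a common detection limit.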



3 Source finding considerations 

Reliability and completeness are of high importance 
when compiling source catalogs as knowing both is es- 
sential for the statistical analysis and interpretation 
of source properties. The process of finding sources 
can be considered one of many important steps (e.g., 
data pre-processing, source finding & characterisation, 
cataloging / post-processing) in the production of as- 
tronomical source catalogs. 

The largest data volumes will come from wide-field
spectral line surveys such as WALLABY and WNSHS,
while radio continuum and polarisation surveys
are typically an order of magnitude smaller. Apart
from GASKAP, the Galactic ASKAP Survey¹ (Dickey
et al. 2012), ASKAP 21-cm surveys will use the full
300 MHz bandwidth. To achieve ~4 km s⁻¹ velocity
resolution, the extragalactic H I surveys, WALLABY,
WNSHS and also the Deep Investigations of Neutral
Gas Origins (DINGO) survey, require the 300 MHz
bandwidth to be divided into 16384 channels.
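
The channel widths quoted for the surveys follow from bandwidth and channel count. A minimal sketch, evaluated at the H I rest frequency (the constants and function name are introduced here for illustration):

```python
HI_REST_FREQ_MHZ = 1420.405751768  # H I rest frequency in MHz
C_KMS = 299_792.458                # speed of light in km/s


def channel_width_kms(bandwidth_mhz: float, n_channels: int) -> float:
    """Velocity width (km/s) of one channel, evaluated at the H I rest frequency."""
    channel_mhz = bandwidth_mhz / n_channels
    return C_KMS * channel_mhz / HI_REST_FREQ_MHZ


print(f"300 MHz / 16384 channels: {channel_width_kms(300.0, 16384):.2f} km/s")
print(f" 64 MHz /  1024 channels: {channel_width_kms(64.0, 1024):.2f} km/s")
print(f"300 MHz /   300 channels: {channel_width_kms(300.0, 300):.0f} km/s")
```

The first two cases correspond to the 3.9 and 13.2 km s⁻¹ channel widths of the 16384-channel and HIPASS set-ups; the third shows why 1 MHz continuum channels are far too coarse for spectral line work.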



1 Within their proposed 7.3 MHz band, centred on 
the Galactic Hi line, GASKAP will also be able to de- 
tect nearby, gas-rich galaxies at high velocity resolution 
(0.25 km s⁻¹). This is of great benefit to the kinematical
study of low-mass dwarf galaxies which rotate slowly and 
often display high turbulence and non-circular motions. 
Figure 3: HIPASS spectra of the Virgo cluster members NGC 4192, IC3105, NGC 4594/9 (top row, from left
to right), NGC 4523, UGC 08091, and NGC 4633/4 (bottom row, from left to right). No Hanning smoothing 
was used here. Bright Galactic H I emission is seen as artifacts in all displayed spectra. Fitted H I properties are 
indicated by red markers in the HIPASS spectra; the fitted 0th order baseline is shown in grey. 



In contrast, radio continuum surveys such as EMU, 
the Evolutionary Map of the Universe survey (Norris 
et al. 2011), use the same band divided into 300 chan- 
nels, each 1 MHz wide. EMU and WALLABY are 
likely to jointly survey the sky at 21-cm. While the 
WALLABY team expects to individually detect more 
than half a million gas-rich galaxies within the survey
volume (3.26 Gpc³; see Section 2), the EMU team
expects to detect ~70 million sources. 

Huynh et al. (2012) explore 2D source finding 
algorithms such as SFind (Hopkins et al. 2002), S- 
Extractor (Bertin & Arnouts 1996), and Duchamp 
(Whiting 2008; Whiting 2012), optimised for compact 
continuum sources. The high continuum source den- 
sity means that confusion, dynamic range, and source 
identification (which is essential to gather optical red- 
shifts) are key issues. Hancock et al. (2012) look 
into compact continuum source-finding and compare 
S-Extractor, MIRIAD imsad (Sault et al. 1995), 
SFind and Aegean. The Circle Hough Transform is 
explored by Hollitt & Johnston-Hollitt (2012) for the 
detection of extended and diffuse objects such as su- 
pernova remnants, radio galaxies and relics. Molinari 
et al. (2011) look into source extraction and photom- 
etry for continuum surveys at mid- and far-infrared 
as well as sub-millimetre wavelengths. They present 
a new method, CUTEX for CUrvature Thresholding 
Extractor, to detect sources in the presence of intense 
and highly variable fore/background emission, in par- 
ticular in the Galactic Plane (see also Marsh & Jarrett 
2012). 

Neither source confusion nor dynamic range is a
concern for WALLABY. But, as the 21-cm line of
atomic neutral hydrogen is intrinsically very faint,
finding and characterising H I clouds, filaments, and
galaxies of various sizes in the 3D data cubes is difficult.
The 3D shape of H I sources (in WALLABY, galaxies
are always extended over numerous velocity channels)
typically provides a 'contiguous block of voxels'
which is distinguishable from white noise. The H I
emission line traces the warm neutral hydrogen gas
in galaxies whose observed velocity dispersion is
typically 10 ± 2 km s⁻¹ (Tamburro et al. 2009). This
means that an H I emission signal from a galaxy (even
a face-on galaxy; see Petric & Rupen 2007) will always
extend over at least two 4 km s⁻¹ wide channels and
generally over more than three channels. The ASKAP
synthesized beam (~10"/30" for the 6-/2-km arrays,
respectively) is sampled by at least three pixels
in each spatial direction.
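
The 'contiguous block of voxels' idea can be illustrated with a toy threshold-and-merge search. The sketch below is a generic illustration under simple assumptions (uncorrelated unit-variance noise, a fixed threshold, face-connected voxels, a hypothetical minimum island size); it is not the algorithm of any of the source finders discussed in this paper.

```python
import random
from collections import deque

# Face-connected (6-neighbour) offsets in (x, y, channel).
NEIGHBOURS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def find_sources(cube, threshold, min_voxels=8):
    """Return islands of connected voxels brighter than threshold.

    cube maps (x, y, channel) -> flux. Islands smaller than min_voxels are
    rejected, since isolated noise peaks rarely form contiguous blocks.
    """
    bright = {voxel for voxel, flux in cube.items() if flux > threshold}
    sources = []
    while bright:
        seed = bright.pop()
        island, queue = [seed], deque([seed])
        while queue:  # breadth-first flood fill from the seed voxel
            x, y, z = queue.popleft()
            for dx, dy, dz in NEIGHBOURS:
                nb = (x + dx, y + dy, z + dz)
                if nb in bright:
                    bright.remove(nb)
                    island.append(nb)
                    queue.append(nb)
        if len(island) >= min_voxels:
            sources.append(sorted(island))
    return sources


# Toy cube: Gaussian noise (sigma = 1) plus one bright 3 x 3 x 4 voxel source.
rng = random.Random(1)
size = 16
cube = {(x, y, z): rng.gauss(0.0, 1.0)
        for x in range(size) for y in range(size) for z in range(size)}
injected = {(x, y, z) for x in range(6, 9) for y in range(6, 9) for z in range(5, 9)}
for voxel in injected:
    cube[voxel] += 10.0

sources = find_sources(cube, threshold=3.5)
print(len(sources), "source(s) found; sizes:", [len(s) for s in sources])
```

Single noise peaks above the threshold do occur in such a cube, but they are discarded by the size cut, which is the practical benefit of requiring contiguity in all three dimensions.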

Duchamp (Whiting 2008; Whiting 2012) is one of 
a number of programs for finding and characterising 
astronomical sources in images and data cubes. For 
ASKAP 21-cm data a specific version of Duchamp, 
called Selavy, is being developed in extensive con- 
sultation with the ASKAP Survey Science Projects 
(see Whiting & Humphreys 2012). Basic testing of 
Duchamp using (1) a set of spatially unresolved sources 
with narrow Gaussian spectra and (2) a set of spatially 
resolved sources with double-horn spectra was carried 
out by Westmeier, Popping & Serra (2012). To im- 
prove both the reliability and completeness of source 
finding over a broad range of parameters, a range of
algorithms, based on wavelets, Kuiper and Kolmogorov-
Smirnov tests, Bayesian statistics, etc., are being
explored.

Among the new 3D algorithms are the Charac- 
terised Noise HI (CNHI) source-finder (Jurek 2012), 
the 2D-1D wavelet reconstruction (Floer & Winkel 2012),
and the Smooth & Clip (S+C) technique (Serra et al. 
2012; Popping et al. 2012). A combination of these 
techniques is currently being investigated. Serra, Ju- 
rek & Floer (2012) look into the reliability of source 
finding algorithms using a statistical analysis of the 
noise characteristics in Hi data cubes. Pre-processing 
techniques such as iterative median smoothing and 
wavelet de-noising are discussed by Jurek & Brown 
(2012). A comprehensive comparison of the above 
mentioned source-finding algorithms has been conducted 
by Popping et al. (2012) using a range of thresholds 
and smoothing options. 
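
The general idea behind smoothing before thresholding can be shown on a single spectrum: a faint but broad line that stays below the clip level at native resolution becomes significant once the data are averaged over a matching width. The sketch below is a simplified 1D illustration with boxcar kernels and an assumed, known noise rms; the kernel widths and threshold are arbitrary choices, and this is not the published S+C implementation.

```python
import math
import random


def smooth_and_clip(spectrum, sigma, widths=(1, 4, 16), nsigma=4.0):
    """Flag channels that exceed nsigma after boxcar smoothing at any width.

    sigma is the per-channel noise rms. For uncorrelated noise, the rms of a
    mean over m channels is sigma / sqrt(m), which sets the clip level.
    """
    n = len(spectrum)
    mask = [False] * n
    for width in widths:
        for i in range(n):
            lo = max(0, i - width // 2)
            hi = min(n, i - width // 2 + width)
            window = spectrum[lo:hi]
            mean = sum(window) / len(window)
            if mean > nsigma * sigma / math.sqrt(len(window)):
                mask[i] = True
    return mask


# Toy spectrum: unit noise plus a 16-channel wide line of amplitude 2 sigma.
rng = random.Random(2)
sigma = 1.0
spectrum = [rng.gauss(0.0, sigma) for _ in range(128)]
for channel in range(40, 56):
    spectrum[channel] += 2.0

mask = smooth_and_clip(spectrum, sigma)
print("flagged channels:", [i for i, flagged in enumerate(mask) if flagged])
```

The final mask is the union over all kernel widths, so both narrow bright features and broad faint ones can be picked up in one pass.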

Overall, the development of efficient and reliable 
3D source finding tools will provide a major advance 
to the analysis of any large spectral line data set. It is 
very important to be able to search data cubes with- 
out adopting any prior knowledge of source properties 
(e.g., galaxy positions from optical or infrared cata- 
logs). Such an open approach ensures a large variety 
of sources (in the Hi context, these are different types 
of galaxies and H I filaments / clouds) is catalogued 
and is essential to avoid bias towards a narrow range 
of sources. Targeted searches, based on prior knowl- 
edge of the respective sources, would complement the 
open approach and address specific science goals. 

Some surveys require highly specialised algorithms, 
for example searching for maser emission in HOPS, the 
H2O Galactic Plane Survey (Walsh et al. 2012), detection
thresholds and bias correction for polarised continuum
sources (George, Stil & Keller 2012), as well
as finding and characterising sources in 4-colour data 
from WISE, the Wide-field Infrared Survey Explorer 
(Marsh & Jarrett 2012). Radio transients require yet 
other considerations; Keith et al. (2010) describe so- 
phisticated pulsar searches in single-dish surveys, while 
Bannister & Cornwell (2011) introduce two new algo- 
rithms for the detection of transients in interferometric 
surveys. 

Targeted searches may focus on particular spatial 
and/or spectral source shapes (e.g., very wide double- 
horn profiles, extended filaments or equi-distant recombination
lines) using highly optimised search algorithms
with built-in assumptions (so-called priors).
For example, Allison et al. (2012a, b) use Bayesian 
statistics to search for Hi absorption lines in spectra 
towards bright radio continuum sources. This will be 
of great importance for SKA pathfinder surveys like 
FLASH, the First Large Absorption Survey in HI led 
by E. Sadler. Another example is provided by Hurley- 
Walker et al. (2012) who use Bayesian analysis with 
specific priors to search for the SZ effect from galaxy 
clusters. 

The developments described above are targeted at
optimising source detection for WALLABY and WNSHS.
They will also be useful for deep interferometric
H I surveys, such as DINGO (led by M. Meyer) and
LADUMA (led by S. Blyth & B. Holwerda), single-dish
H I surveys, such as HIPASS (Barnes et al. 2001;
Koribalski et al. 2004), HIJASS (Lang et al. 2003;
Wolfinger et al. 2012), and EBHIS (Winkel et al.
2010), and many other large-volume H I surveys.

4 HIPASS source-finding 

The development of a powerful 13-beam receiver system
(Tsys ~ 20 K) plus versatile correlator on the 64-m
Parkes telescope instigated an era of large-scale 21-cm
surveys of our Galaxy and the Local Universe. The
HI Parkes All-Sky Survey (HIPASS) is the largest and
most prominent of the Parkes H I surveys. It covers
the whole sky to a declination limit of δ = +25° over
a velocity range from -1280 to 12 700 km s⁻¹. The
Parkes gridded beam is ~15.5 arcmin (to sample the
beam adequately a pixel size of 4 arcmin is used), the
velocity resolution is 18 km s⁻¹, and the rms noise is
~13 mJy beam⁻¹ per channel (for a typical integration
time of 8 minutes). See Barnes et al. (2001) for a
detailed description of the HIPASS observations,
calibration and imaging techniques.

Current galaxy catalogs include the HIPASS Bright 
Galaxy Catalog (HIPASS BGC; Koribalski et al. 2004; 
Ryan-Weber et al. 2002; Zwaan et al. 2003), the
southern HIPASS catalog (HICAT; Meyer et al. 2004) 
and the northern HIPASS catalog (NHICAT; Wong et 
al. 2006). Together these catalogs, which are highly 
reliable (Zwaan et al. 2004), contain more than 5000 
Hi-rich galaxies and a few Hi clouds (e.g., HIPASS 
J0731-69; Ryder et al. 2001). In addition, Parkes Hi 
multibeam surveys of the Zone of Avoidance (ZOA) 
have catalogued more than 1000 galaxies (Staveley- 
Smith et al. 1998; Henning et al. 2000; Juraszek et 
al. 2000; Donley et al. 2005). Compact and extended 
populations of Galactic high-velocity clouds (HVCs) 
were catalogued by Putman et al. (2002). 

Because our aim was to produce highly reliable 
HIPASS catalogs, many faint H I sources have not (yet) 
been catalogued (see Figs. 1 & 2). Furthermore, gas- 
rich galaxies with velocities less than ~300 km s⁻¹ lie
outside the parameter space considered for (N)HICAT 
(see Fig. 3). This limit was chosen to avoid confusion 
with Galactic HVCs. 

An advanced HIPASS data reduction is under way 
(Calabretta et al. 2012, in prep.), aiming to reduce 
on- and off-source spectral ripple as well as improve 
the bandpass calibration, survey sensitivity and source 
parametrisation. Our goal is to obtain much deeper 
HIPASS catalogs by employing the sophisticated source- 
finding algorithms discussed in this PASA Special Issue.
A comparison of the original HIPASS data with
the new version for several areas is under way. 

4.1 New HIPASS sources 

In this section, I use HIPASS data to highlight the
diversity of spectral signatures of galaxies and gaseous
clouds, both in emission and absorption. I present
some previously uncatalogued HIPASS detections of
galaxies. Their low signal-to-noise and/or low velocity
prevented inclusion in the published catalogs which
were compiled following a blind search and verification
based on the HIPASS data alone (see Meyer et
al. 2004; Wong et al. 2006). The new HIPASS
detections of galaxy groups, pairs and individual spirals
are shown in Figs. 1-3. HIPASS names are assigned
as previously, using their fitted H I position. Detailed
source descriptions are given in the Appendix, and
their HIPASS properties are summarised in Table 2.

www.publish.csiro.au/journals/pasa

Galaxies with notable Hi absorption features de- 
tected in HIPASS are briefly discussed by Koribalski 
et al. (2004; Section 3.6). Prominent examples are 
NGC 253 (HIPASS J0047-25), NGC 3256 (HIPASS 
J1028-43), NGC 4945 (HIPASS J1305-49), Circinus 
(HIPASS J1413-65) and NGC 5128 (HIPASS J1324- 
42). Some galaxies with bright radio nuclei, such as 
NGC 5793 (HIPASS J1459-16A), are easily detected in 
Hi absorption but not seen in Hi emission, highlight- 
ing the fact that such galaxies would be missing from 
any Hi peak-flux limited catalogs despite their sub- 
stantial Hi content. Because of the large Parkes beam 
(15.5 arcmin) HIPASS spectra of individual nearby 
galaxies with bright radio continuum emission may 
show a combination of Hi emission from the galaxy 
disk and H I absorption against the star-forming nuclear
region (examples are shown in Fig. 2 and dis-
cussed in the Appendix). 

Here I report the discovery of a 680 km s⁻¹ wide H I
absorption trough in the megamaser galaxy NGC 5793.
This feature is seen in addition to the well-known narrow
H I absorption line reported by Pihlström et al.
(2000). Figs. 4 & 5 show the respective HIPASS spectra;
further details are given in the Appendix.

Galaxies in the Virgo cluster have systemic velocities
typically ranging from -700 to +2700 km s⁻¹.
Notably, at velocities around zero, single-dish H I observations
are confused by Galactic H I emission and HVCs,
and the galaxy H I properties can be hard to measure 
accurately. The HIPASS spectra of nine Virgo galax- 
ies are shown in Fig. 3 and their properties are sum- 
marised in Table 2. Independent distance estimates to 
Virgo galaxies range from about 16 to 24 Mpc. 

4.2 Statistical techniques 

To study the properties of astronomical sources below 
the survey detection threshold, statistical approaches 
such as stacking and intensity mapping are used. For 
example, Pen et al. (2009) use the HIPASS data to 
co-add Hi spectra at the positions of 27417 optical 
galaxies, after shifting their systemic velocities to a 
common restframe. As expected, this results in a large 
H I signal, dominated by the bright H I emission of in- 
dividually detected galaxies. After removing all galax- 
ies listed in HICAT (Meyer et al. 2004) the co-added 
Hi signal is detected at high significance (Meyer, priv. 
comm.). The deep ASKAP H I survey DINGO will detect
most H I sources through stacking of spectra from
already known galaxies with accurate velocities.
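
The gain from stacking can be demonstrated with simulated spectra: each spectrum is shifted so that the known systemic velocity falls on a common channel, and the aligned spectra are averaged. The sketch below is a toy illustration with assumed numbers (400 spectra, a line of half the noise rms in a single channel); it does not reproduce the analysis of Pen et al. (2009).

```python
import random


def stack_spectra(spectra, centre_channels, half_width):
    """Average spectra after aligning each on its systemic-velocity channel."""
    length = 2 * half_width + 1
    total = [0.0] * length
    for spectrum, centre in zip(spectra, centre_channels):
        for k in range(length):
            total[k] += spectrum[centre - half_width + k]
    return [value / len(spectra) for value in total]


# Simulate faint lines at random channels in noisy spectra (noise rms = 1).
rng = random.Random(3)
n_gal, n_chan, half_width = 400, 128, 20
spectra, centres = [], []
for _ in range(n_gal):
    centre = rng.randrange(half_width, n_chan - half_width)
    spectrum = [rng.gauss(0.0, 1.0) for _ in range(n_chan)]
    spectrum[centre] += 0.5  # line well below the noise of a single spectrum
    spectra.append(spectrum)
    centres.append(centre)

stacked = stack_spectra(spectra, centres, half_width)
peak = max(range(len(stacked)), key=stacked.__getitem__)
print("stacked peak at channel offset", peak - half_width,
      "with amplitude", round(stacked[peak], 2))
```

Because the noise of the average falls as the square root of the number of spectra, a line invisible in any individual spectrum stands out clearly in the stack, provided the input velocities are accurate enough to align it.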

Another technique to study H I emission in and between
galaxies over large volumes is known as intensity
mapping. Instead of aiming to individually detect
galaxies, which requires high angular resolution and
sensitivity, this approach can be employed to measure
the collective emission of many galaxies at low angular
resolution (several tens of Mpc). The 21-cm intensity
mapping allows the 3D measurement of large scale
structures and velocity/flow fields to large redshifts.

Figure 4: HIPASS spectrum towards the Sy2 galaxy
NGC 5793 (v_opt = 3491 km s⁻¹), also known as
PKS 1456-164 (~1 Jy at 1.4 GHz) and HIPASS
J1459-16A (Koribalski et al. 2004). We detect the
well-known deep H I absorption feature (~3420 to
3590 km s⁻¹) and a very wide H I absorption trough
(~3200 to 3880 km s⁻¹). For a detailed view of the
latter see Fig. 5. The H I emission feature at
2861 km s⁻¹ (HIPASS J1459-16B) is associated with
the dwarf irregular galaxy 6dF J1459410-164235 (v_opt
= 2857 km s⁻¹) and not the E0 galaxy NGC 5796 (v_opt
= 2971 km s⁻¹).

5 Visualisation 

Visual data exploration and discovery is used across 
all sciences with a large range of tools and algorithms 
available. Each discipline requires suitable software to 
analyse their data, the complexity and volume of which 
is growing steadily. Here I look into astronomical soft- 
ware packages and specific algorithms that allow the 
visualisation of galaxy data (incl. numerical simula- 
tions) from individual objects to survey catalogs. 

3D visualisations of several thousand catalogued 
HIPASS galaxies (see Section 4) were created by Mark 
Calabretta and are available on-line². The animations
depict the distribution of gas-rich galaxies in the
nearby Universe (z < 0.03), taking into account their 
positions, velocities / distances and Hi masses. Large- 
scale structures such as the Supergalactic Plane and 
the Local Void (see Koribalski et al. 2004) are clearly 
visible. In future we should be able to replace each 
source, currently depicted by a sphere, by a 3D multi- 
wavelength rendering of the respective galaxy. 
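
Placing catalogued galaxies in a 3D scene requires converting sky position and velocity into Cartesian coordinates. A minimal sketch is given below, assuming a pure Hubble-flow distance D = cz / H0 with an illustrative H0 of 70 km s⁻¹ Mpc⁻¹; both the function name and the choice of H0 are assumptions, and the animations mentioned above may use different distance estimates.

```python
import math


def galaxy_xyz(ra_deg, dec_deg, cz_kms, h0=70.0):
    """Cartesian position (Mpc) from right ascension, declination and cz.

    h0 is an assumed Hubble constant in km/s/Mpc. The distance D = cz / h0
    ignores peculiar velocities, which are significant for nearby galaxies.
    """
    distance = cz_kms / h0
    ra = math.radians(ra_deg)
    dec = math.radians(dec_deg)
    return (distance * math.cos(dec) * math.cos(ra),
            distance * math.cos(dec) * math.sin(ra),
            distance * math.sin(dec))


# Example: a galaxy at RA = 45 deg, Dec = -30 deg with cz = 3500 km/s.
x, y, z = galaxy_xyz(45.0, -30.0, 3500.0)
print(f"x = {x:.1f} Mpc, y = {y:.1f} Mpc, z = {z:.1f} Mpc")
```

Each galaxy can then be drawn as a sphere at (x, y, z), with its size or colour scaled by H I mass.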

2 www.atnf.csiro.au/people/mcalabre/animations
[Figure 5 panel: HIPASS J1459-16A (NGC 5793); horizontal axis: Optical Velocity, cz (km s⁻¹)]

Figure 5: A closer look at the megamaser galaxy
NGC 5793 reveals a very broad H I absorption feature,
extending from ~3200 to 3880 km s⁻¹ or -290
to +390 km s⁻¹ with respect to NGC 5793's systemic
velocity. This remarkable 680 km s⁻¹ wide feature has
not previously been detected and requires further
investigation.



Dolag et al. (2008) outline the wide-ranging bene- 
fits of data visualisation, in particular 3D rendering of 
complex data sets, and introduce the SPLOTCH pub- 
lic ray-tracing software (see description below). They 
emphasize the need for tools to visualise scientific data 
in a comprehensive, self-describing, rich and appealing 
way. SPLOTCH is a powerful package to do this (see 
also Jin et al. 2010). 

Multi-wavelength images and spectral line data cubes 
of galaxies allow us to measure their stellar, gas and 
dark matter properties. Visualisation packages such 
as KARMA (see below) provide a range of tools to 
interactively view 2D and 3D data sets as well as ap- 
ply mathematical operations. This allows not only the 
quick inspection and evaluation of multiple images, 
spectra and cubes, but also the production of beau- 
tiful multi-color images and animations. 

To improve our understanding of galaxy formation 
and evolution, we need to include models and theo- 
retical knowledge together with observations of galaxy 
disks and halos. By fitting and modelling the observed 
gas distribution and kinematics of extended galaxy 
disks, we derive their 3-dimensional shapes and ro- 
tational velocities (e.g., Jozsa 2007; Kamphuis et al. 
2011). Visualisation can then be employed to com- 
bine the actual data with our derived knowledge to 
re-construct the most likely 3D representation of each 
galaxy. By adding time as the fourth dimension one 
can also visualize the evolution of galaxies and the Uni- 
verse (e.g., see the 4D Universe visualisation by Dolag 
et al. 2008). 

A large range of astronomical visualisation pack- 
ages is available, most of which are not discussed here. 
VO-compatible applications such as Aladin, SkyView, 
TopCat, and VOPlot are well suited to viewing images,
spectra and catalogs. Here I briefly highlight a
few other visualisation tools:

KARMA is a widely used, interactive software toolkit³
for the exploration and analysis of multi-frequency
astrophysical images, spectra and data cubes (Gooch
et al. 1995, 1996). KARMA is freely available and
highly versatile, including a diverse range of software
tools (e.g., KVIS, XRAY, KOORDS, KPVSLICE, KRENZO,
KSHELL). The most popular and probably best known
tool is KVIS, capable of visualising multi-wavelength
images (FITS format and others) as well as spectral line
data cubes. The XRAY program allows the user to volume
render, animate, and explore spectral line (e.g., H I) cubes
(examples are given in Fig. 6). The strength of KARMA
lies in the rapid and intuitive inspection of small data
cubes and images through interactive visualisation. No
further developments are planned, limiting the usefulness
of the package for future data sets. The tools
(and people) exist to write a much enhanced software
package, which would be widely used and benefit many
researchers.

SPLOTCH is a powerful and very flexible ray-tracer
software tool⁴ which supports the visualization of large-
scale cosmological simulation data (Dolag et al. 2008, 
2011; Jin et al. 2010). It is publicly available and con- 
tinues to be enhanced. A small team is currently work- 
ing on supporting the visualization of multi-frequency 
observational data to achieve realistic 3D views and 
fly-throughs of nearby galaxies and galaxy groups. Large- 
scale astrophysical data sets coming from particle-based 
simulations have been successfully explored. 

PARAVIEW is a fully parallel, open-source visualization 
toolkit⁵, used for analyzing and visualizing cosmological 
simulations (see, e.g., Woodring et al. 2011). 

S2PLOT is an advanced 3D plotting library⁶ with 
support for standard and enhanced display devices (Barnes 
et al. 2006). It provides techniques for displaying and 
interactively exploring astrophysical 3D data sets; for 
examples see Fluke et al. (2010a). 

CHROMOSCOPE is an interactive tool⁷ that facilitates 
the exploration and comparison of multi-wavelength 
FITS images of astronomical data sets on various scales. 
For example applications see Walsh et al. (2012). 

VISIVO is an integrated suite of software tools for 
the visualization of astrophysical data tables (Becciani 
et al. 2010). Its web portal, VisIVOWeb, allows users 
to create customized views of 3D renderings (Costa et 
al. 2011). In contrast to SPLOTCH, neither VISIVO⁸ 
nor TIPSY⁹ is designed to produce ray-tracing-like 
images (Dolag et al. 2008). 

3 www.atnf.csiro.au/computing/software/karma 

4 www.mpa-garching.mpg.de/~kdolag/splotch 

5 www.paraview.org 

6 astronomy.swin.edu.au/s2plot 

7 www.chromoscope.net 

8 visivoweb.oact.inaf.it/visivoweb 

9 www-hpcc.astro.washington.edu/tools/tipsy/tipsy.html 

Specific visualisation challenges for WALLABY are 
addressed by Fluke et al. (2010b), and Hassan & Fluke 
(2011). In this PASA Special Issue Hassan, Fluke & 
Barnes (2012) look into real-time 3D volume rendering 
of large (TBytes) astronomical data cubes. 



6 Summary & Outlook 

The large data volumes (images, cubes, and time se- 
ries) expected from ASKAP and other SKA Pathfind- 
ers will require sophisticated source finding algorithms 
and visualisation tools. I presented an overview of the 
current state of astronomical source finding, with 
emphasis on spectral line source finding for extragalactic 
H I surveys. This snapshot highlights the many devel- 
opments underway (e.g., Jurek 2012, Floer & Winkel 
2012, Allison et al. 2012), exploring new techniques 
and testing these comprehensively (Westmeier et al. 
2012, Popping et al. 2012). Not only are future source 
finding algorithms required to be highly reliable and 
complete, they also have to be fast. The challenge to 
find faint and/or unusual sources remains. To illustrate 
the difficulty of finding faint H I emission and/or 
absorption signals in large, noise-dominated data cubes, 
I searched several HIPASS cubes and discussed the 
new detections. The data cubes produced by WALLABY 
(δ < +30°; z < 0.26), one of several large 
21-cm surveys planned for ASKAP, will be so large 
that automated source-finding is essential. Visualisa- 
tion will play a major role throughout the process of 
data calibration, source finding and source analysis. 
The algorithms used for the visualisation of large data 
cubes (e.g., to enable data quality control and error 
recognition, to find extended source structures, large- 
scale filaments and voids) are also required to be fast, 
as well as intuitive, interactive, and reliable. I gave an 
overview of some visualisation tools, many of which 
are also under development to allow both the analy- 
sis and interpretation of large data volumes, including 
ASKAP 21-cm surveys. 



Appendix 

The new HIPASS detections are briefly described here. 
Table 2 lists their H I emission and absorption proper- 
ties, as fitted with the MIRIAD task mbspect. 

• HIPASS J0354-36a: extended H I emission 
(see Fig. 1, left) is detected from the region 
encompassing the galaxies IC 2006 (v_opt = 1381 km s⁻¹) 
and ESO359-G005 (v_opt = 1399 km s⁻¹). Both 
are members of the Fornax cluster, which was 
studied in detail by Waugh et al. (2002). IC 2006 
is an S0 galaxy with a 5-arcmin diameter H I 
ring (Schweitzer et al. 1989); the dwarf irregular 
galaxy ESO359-G005 is a gas-rich companion. 
I measure a total H I flux density of ~9.8 
Jy km s⁻¹ (corresponding to an H I mass of 1.6 × 
10⁹ M⊙ assuming a distance of 26 Mpc), about 
twice the amount detected by Schweitzer et al. 
(1989). Other prominent S0 galaxies with (partial) 
H I rings are NGC 1490 (HIPASS J0352-66) 
and NGC 1533 (HIPASS J0409-56); see Oosterloo 
et al. (2007) and Ryan-Weber et al. (2004), 
respectively. 

• HIPASS J1239-11: wide H I emission (see Fig. 1, 
right) from the Sombrero galaxy (M 104, NGC 4594; 
v_opt = 1082 km s⁻¹) is detected in HIPASS, even 
though the H I spectrum is strongly affected by 
baseline ripple. Most prominent are the two H I 
peaks of the double-horn spectrum, separated 
by nearly 800 km s⁻¹. The H I emission appears 
to extend along the edge-on dust ring of this 
beautiful early-type galaxy (Bajaja et al. 1984). 
There is also a hint of narrow H I absorption 
(~1138 km s⁻¹) against its Sy2 nucleus. 
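
The H I masses quoted in this Appendix follow from the measured integrated flux and the assumed distance. As a minimal sketch, assuming the standard optically thin relation M_HI = 2.356 × 10⁵ D² F_HI (with D in Mpc and F_HI in Jy km s⁻¹, a relation not restated in this section), the following Python snippet reproduces the value given for HIPASS J0354-36a. The function name hi_mass is illustrative only and not part of any package mentioned here.

```python
# Assumption: standard optically thin H I mass relation,
# M_HI [M_sun] = 2.356e5 * D[Mpc]^2 * F_HI[Jy km/s].
HI_MASS_COEFF = 2.356e5


def hi_mass(flux_jy_kms: float, distance_mpc: float) -> float:
    """Return the H I mass in solar masses (hypothetical helper)."""
    if flux_jy_kms < 0 or distance_mpc <= 0:
        raise ValueError("flux must be >= 0 and distance must be > 0")
    return HI_MASS_COEFF * distance_mpc ** 2 * flux_jy_kms


# HIPASS J0354-36a: F_HI ~ 9.8 Jy km/s at an assumed distance of 26 Mpc.
mass_j0354 = hi_mass(9.8, 26.0)
print(f"HIPASS J0354-36a: M_HI ~ {mass_j0354:.1e} M_sun")
```

Because the mass scales with the square of the distance, a 10 per cent distance error changes the derived H I mass by about 20 per cent, so the quoted masses are only as reliable as the adopted distances.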

The following HIPASS galaxies (see Fig. 2) show 
both faint H I emission and absorption features. Their 
parametrisation is difficult and remains tentative. In 
most cases, interferometric H I data for these sources 
exist in the literature. For further studies of galaxies 
with associated H I absorption, see van Gorkom et al. 
(1989), Pihlström (2001), Taylor et al. (2002), and 
Morganti et al. (2005). 

Table 2: Summary of H I emission and absorption properties for the galaxies discussed in this paper, as 
derived from the respective HIPASS spectra using the MIRIAD task mbspect. For further details see 
Section 4.1 and the notes on individual galaxies in the Appendix. Notes: (1) I was able to fit the 
narrow H I absorption feature in NGC 5793 and give a width estimate of the wide absorption feature. 
(2) In IC 5063 I fit the H I absorption between 2700 and 3100 km s⁻¹. (3) The tentative H I absorption 
feature in TXS 2226-184 requires confirmation. (4) The H I properties listed for NGC 4192 are from VLA 
observations by Cayatte et al. (1990). 

• HIPASS J1459-16A: wide (680 km s⁻¹) and 
narrow (150 km s⁻¹) H I absorption features are 
detected towards the well-known water megamaser 
galaxy NGC 5793 (PKS B1456-164; v_opt 
= 3491 km s⁻¹). Figs. 4 & 5 show the respective 
HIPASS spectra. The H I velocity range seen 
in absorption, ~3200 to 3880 km s⁻¹, is broader 
than that of the known H₂O masers (Hagiwara 
et al. 1997) and may indicate a much higher 
rotational velocity of the nuclear ring and 
consequently a much larger central mass than 
previously estimated. This is a spectacular discovery 
which requires further investigation. There 
are clear similarities to the H I absorption systems 
found in PKS 1814-637 and discussed by 
Morganti et al. (2011). The much narrower 
but extremely deep H I absorption line seen 
towards NGC 5793 is discussed by Pihlström et 
al. (2000); see also Gardner & Whiteoak (1986) 
and Koribalski et al. (2004). NGC 5793 has 
several neighbours, two of which are detected 
in HIPASS: the Sb galaxy NGC 5815 (v_opt = 
2995 km s⁻¹) and the dIrr galaxy 6dF J1459410-164235 
(v_opt = 2857 km s⁻¹). The E0 galaxy 
NGC 5796 is probably H I poor. 

• HIPASS J2052-57: H I emission and blue-shifted 
H I absorption are detected towards the 
Sy2 galaxy IC 5063 (PKS B2048-572; v_opt = 
3402 km s⁻¹). The latter is indicative of fast 
gas outflow; for a detailed discussion see Morganti 
et al. (1998) and Oosterloo et al. (2000). 
Note that RFI at 1408 MHz (~2650 km s⁻¹) also 
affects the displayed HIPASS spectrum. 

• HIPASS J1140+17: wide H I emission and 
narrow H I absorption are detected towards the 
FR-I galaxy NGC 3801 (PKS B1137+180; v_opt 
= 3494 km s⁻¹). For a detailed discussion of 
NGC 3801 and its gas-rich environment see Emonts 
et al. (2012). Six galaxies and two H I clouds 
contribute to the emission of HIPASS J1140+17 
(from ~3100 to 4000 km s⁻¹). 

• HIPASS J0133-36: wide H I emission and narrow 
H I absorption are also detected towards the 
FR-II radio galaxy NGC 612 (PKS 0131-36; v_opt 
= 8925 km s⁻¹). The mid-point of the H I emission 
agrees with that of the narrow H I absorption 
(~8789 km s⁻¹). For a detailed study see 
Emonts et al. (2008). We note a ~100 km s⁻¹ 
offset to their H I absorption line measurement. 
The HIPASS F_HI measurement, comprising NGC 612 
and neighbouring galaxies, is ~13 Jy km s⁻¹, 
corresponding to 4.7 × 10¹⁰ M⊙ (assuming D 
= 125 Mpc). 

• HIPASS J2229-18: weak H I absorption over 
a wide velocity range, from ~6900 to 7650 km s⁻¹, 
appears to be detected towards the gigamaser 
galaxy TXS 2226-184 (v_opt = 7551 km s⁻¹). With 
the majority of the absorption clearly blue-shifted 
with respect to the systemic velocity, gas outflow 
is a likely explanation. VLA H I measurements 
reveal a much narrower (420 km s⁻¹) absorption 
feature (Taylor et al. 2002, 2004). Further H I 
data are needed to confirm this result. 
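
The entry for HIPASS J2052-57 equates RFI at 1408 MHz with a velocity of ~2650 km s⁻¹. As a rough check, the sketch below converts an observed frequency to a velocity using the H I rest frequency of 1420.40575 MHz. Which velocity convention the HIPASS spectra use is not stated in this section, so both the optical and the radio definitions are shown; the function names are illustrative only. The same rest frequency gives the lowest observing frequency implied by the WALLABY redshift limit z < 0.26.

```python
C_KMS = 299792.458        # speed of light in km/s
HI_REST_MHZ = 1420.40575  # rest frequency of the H I 21-cm line in MHz


def optical_velocity(freq_mhz: float) -> float:
    """Optical convention: v = c * (f0 - f) / f, i.e. v = c * z."""
    return C_KMS * (HI_REST_MHZ - freq_mhz) / freq_mhz


def radio_velocity(freq_mhz: float) -> float:
    """Radio convention: v = c * (f0 - f) / f0."""
    return C_KMS * (HI_REST_MHZ - freq_mhz) / HI_REST_MHZ


def observed_frequency(redshift: float) -> float:
    """Observed frequency in MHz of the H I line at a given redshift."""
    return HI_REST_MHZ / (1.0 + redshift)


rfi_mhz = 1408.0
print(f"optical: {optical_velocity(rfi_mhz):.0f} km/s")
print(f"radio:   {radio_velocity(rfi_mhz):.0f} km/s")
print(f"z = 0.26 -> {observed_frequency(0.26):.1f} MHz")
```

Under these assumptions the optical definition lands closer to the quoted ~2650 km s⁻¹ than the radio definition, although the difference between the two is only a few tens of km s⁻¹ at this frequency.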



HIPASS detections of galaxies in and near the Virgo 
cluster (see Fig. 3) are discussed below. Extensive H I 
studies were carried out by Warmels (1988a,b), Cay- 
atte et al. (1990) and Chung et al. (2009). 

• HIPASS J1213+14: Cayatte et al. (1990) report 
on VLA H I observations of the large spiral 
galaxy M 98 (NGC 4192). They find the gas 
distribution to be warped, extending nearly 15 
arcmin in diameter over a velocity range from 
about -360 to +120 km s⁻¹ (see also Chung et al. 
2009). Assuming a distance of 16 Mpc, we derive 
an H I mass of 4.6 × 10⁹ M⊙. Our HIPASS spectrum 
is highly contaminated by bright Galactic 
H I emission in the velocity range from about 
-100 to +60 km s⁻¹ as well as HVCs at higher 
velocities. 

• HIPASS J1217+12: IC 3105 is a small edge-on 
galaxy at a distance of 14 Mpc. Our F_HI value 
agrees well with the Arecibo H I measurements 
by Schneider et al. (1990). We derive an H I 
mass of 3.2 × 10⁸ M⊙. 

• HIPASS J1221+11: NGC 4294/9 is an interacting 
galaxy pair with extended H I emission 
from both galaxies (Chung et al. 2009). The 
projected separation between NGC 4294 (v_sys = 
363 km s⁻¹) and NGC 4299 (v_sys = 227 km s⁻¹) 
is 5.6 arcmin. Our HIPASS F_HI measurement 
of approximately 44 Jy km s⁻¹ for the system 
(HIPASS J1221+11; Wong et al. 2006) agrees 
well with the F_HI values (27 and 18 Jy km s⁻¹) 
given by Chung et al. from VLA H I data, but 
is likely an underestimate. 

• HIPASS J1233+15: NGC 4523 is a Magellanic 
dwarf galaxy at a distance of 17 Mpc. Our 
F_HI value agrees well with previous H I measurements. 
We derive an H I mass of 1.4 × 10⁹ M⊙. 

• HIPASS J1242+14: NGC 4633/4 is an interacting 
galaxy pair detected with the WSRT 
in H I emission over a velocity range from -50 
to +410 km s⁻¹ by Oosterloo & Shostak (1993). 
The projected separation between NGC 4634 (v_sys 
= 115 km s⁻¹) and NGC 4633 (v_sys = 297 km s⁻¹) 
is 3.8 arcmin. Our HIPASS F_HI measurement of 
12 Jy km s⁻¹ is slightly lower than the sum of 
the F_HI values (7.4 and 5.8 Jy km s⁻¹) given 
by Oosterloo & Shostak. Our underestimate is 
due to Galactic H I emission at velocities below 
+140 km s⁻¹ affecting our ability to estimate the 
blue-shifted H I emission from NGC 4634. 

• HIPASS J1258+14: UGC 08091 is a dwarf irregular 
galaxy at a distance of about 2 Mpc. It 
lies in the foreground of the Virgo cluster. Our 
F_HI value agrees well with previous H I measurements. 
We derive an H I mass of 8.5 × 10⁶ M⊙. 